StableTrack: Stabilizing Multi-Object Tracking on Low-Frequency Detections

Shelukhan, Matvei, Mamedov, Timur, Kvanchiani, Karina

arXiv.org Artificial Intelligence

Multi-object tracking (MOT) is one of the most challenging tasks in computer vision, where it is important to correctly detect objects and associate these detections across frames. Current approaches mainly focus on tracking objects in each frame of a video stream, making it almost impossible to run the model under conditions of limited computing resources. To address this issue, we propose StableTrack, a novel approach that stabilizes the quality of tracking on low-frequency detections. Our method introduces a new two-stage matching strategy to improve the cross-frame association between low-frequency detections. We propose a novel Bbox-Based Distance instead of the conventional Mahalanobis distance, which allows us to effectively match objects using the Re-ID model. Furthermore, we integrate visual tracking into the Kalman Filter and the overall tracking pipeline. Our method outperforms current state-of-the-art trackers in the case of low-frequency detections, achieving 11.6% HOTA improvement at 1 Hz on MOT17-val, while keeping up with the best approaches on the standard MOT17, MOT20, and DanceTrack benchmarks with full-frequency detections.
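The two-stage association described above can be sketched as follows. This is a hypothetical illustration, not the paper's implementation: the actual Bbox-Based Distance is not specified here, so 1 - IoU stands in for it, greedy matching stands in for an assignment solver, and the thresholds are invented.

```python
# Hypothetical sketch of a two-stage association step for low-frequency
# detections: stage 1 matches tracks to detections by appearance (Re-ID
# cosine distance), stage 2 matches the leftovers by a bbox-overlap
# distance. 1 - IoU is a stand-in for the paper's Bbox-Based Distance.
import numpy as np

def iou(a, b):
    """IoU of two boxes in (x1, y1, x2, y2) format."""
    x1, y1 = max(a[0], b[0]), max(a[1], b[1])
    x2, y2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def greedy_match(cost, thresh):
    """Greedily pair rows to columns while the cost stays below thresh."""
    matches, used_r, used_c = [], set(), set()
    for r, c in sorted(np.ndindex(cost.shape), key=lambda rc: cost[rc]):
        if cost[r, c] < thresh and r not in used_r and c not in used_c:
            matches.append((r, c)); used_r.add(r); used_c.add(c)
    return matches, used_r, used_c

def two_stage_associate(trk_emb, det_emb, trk_box, det_box,
                        app_thresh=0.4, iou_thresh=0.7):
    # Stage 1: appearance (cosine distance between Re-ID embeddings,
    # assumed L2-normalized).
    app = 1.0 - (trk_emb @ det_emb.T)
    m1, ur, uc = greedy_match(app, app_thresh)
    # Stage 2: spatial overlap for the remaining tracks and detections.
    rows = [r for r in range(len(trk_box)) if r not in ur]
    cols = [c for c in range(len(det_box)) if c not in uc]
    m2 = []
    if rows and cols:
        cost = np.array([[1.0 - iou(trk_box[r], det_box[c]) for c in cols]
                         for r in rows])
        for i, j in greedy_match(cost, iou_thresh)[0]:
            m2.append((rows[i], cols[j]))
    return m1 + m2
```

The second stage matters precisely in the low-frequency setting: after a long inter-detection gap, appearance may drift below the matching threshold while the predicted box still overlaps the detection.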


Russia infiltrates Pokrovsk with new tactics that test Ukraine's drones

Al Jazeera

Russian forces have spread rapidly through Pokrovsk, the city in Ukraine's east where the warring sides have concentrated their manpower and tactical ingenuity during the past week, in what may be the culmination of a 21-month battle. Geolocated footage placed Russian troops in central, northern and northeastern Pokrovsk, said the Institute for the Study of War (ISW), a Washington-based think tank. Russia set its sights on the city almost two years ago, after capturing Avdiivka, 39km (24 miles) to the east.


Russia-Ukraine war: List of key events, day 1,350

Al Jazeera

Russian and Ukrainian troops have fought battles in the ruins of Pokrovsk, a transport and logistics hub in eastern Ukraine, with Ukraine's military reporting fierce fighting under way in a part of the city that was key for Kyiv's front-line logistics. Ukrainian President Volodymyr Zelenskyy said he visited troops fighting near the eastern city of Dobropillia, where Ukrainian forces are conducting a counteroffensive against Russian troops. Russia struck civilian energy and port infrastructure in a massive overnight drone attack on Ukraine's southern region of Odesa, the region's governor said in a post on the Telegram messaging app, adding that rescuers extinguished fires and there were no casualties.


Zelensky visits troops near embattled front line town of Pokrovsk

BBC News

Ukrainian President Volodymyr Zelensky says he has visited troops near the town of Pokrovsk, where the fiercest front-line battle between Russia and Ukraine is currently taking place. Zelensky posted photos showing him meeting personnel at a command post in the Dobropillya sector, some 20km (12 miles) north of Pokrovsk in the Donetsk region. Kyiv's top military commander, Oleksandr Syrskiy, said on Monday that Ukraine was increasing pressure on the Dobropillya front to force the enemy to disperse its forces and make it impossible to concentrate their main efforts in the Pokrovsk area. Russia has been trying to seize Pokrovsk - a strategic front-line town and logistics hub - for over a year. Although Russian soldiers took months to approach the town's borders, they have now infiltrated it, and on Friday Zelensky said Russia had amassed 170,000 troops on its outskirts.


Explainable artificial intelligence model predicting the risk of all-cause mortality in patients with type 2 diabetes mellitus

Vershinina, Olga, Sabbatinelli, Jacopo, Bonfigli, Anna Rita, Colombaretti, Dalila, Giuliani, Angelica, Krivonosov, Mikhail, Trukhanov, Arseniy, Franceschi, Claudio, Ivanchenko, Mikhail, Olivieri, Fabiola

arXiv.org Artificial Intelligence

Objective. Type 2 diabetes mellitus (T2DM) is a highly prevalent non-communicable chronic disease that substantially reduces life expectancy. Accurate estimation of all-cause mortality risk in T2DM patients is crucial for personalizing and optimizing treatment strategies. Research Design and Methods. This study analyzed a cohort of 554 patients (aged 40-87 years) with diagnosed T2DM over a maximum follow-up period of 16.8 years, during which 202 patients (36%) died. Key survival-associated features were identified, and multiple machine learning (ML) models were trained and validated to predict all-cause mortality risk. To improve model interpretability, Shapley additive explanations (SHAP) were applied to the best-performing model. Results. The extra survival trees (EST) model, incorporating ten key features, demonstrated the best predictive performance. The model achieved a C-statistic of 0.776, with area under the receiver operating characteristic curve (AUC) values of 0.86, 0.80, 0.841, and 0.826 for 5-, 10-, 15-, and 16.8-year all-cause mortality predictions, respectively. The SHAP approach was employed to interpret the model's individual decision-making processes. Conclusions. The developed model exhibited strong predictive performance for mortality risk assessment. Its clinically interpretable outputs enable potential bedside application, improving the identification of high-risk patients and supporting timely treatment optimization.
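The C-statistic reported above is Harrell's concordance index: among comparable patient pairs, the fraction where the higher predicted risk belongs to the patient who died earlier. A minimal pure-Python sketch (real analyses would use a library such as scikit-survival, and this simple pair count ignores tied event times):

```python
# Minimal sketch of Harrell's concordance index (the C-statistic) for
# right-censored survival data. A pair (i, j) is comparable only if i's
# death was observed before j's follow-up time; censored patients never
# serve as the earlier member of a pair.
def concordance_index(times, events, risks):
    """times: follow-up times; events: 1 = death observed, 0 = censored;
    risks: model risk scores (higher = predicted earlier death)."""
    concordant, ties, comparable = 0, 0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1
                elif risks[i] == risks[j]:
                    ties += 1
    return (concordant + 0.5 * ties) / comparable
```

A model that ranks every patient correctly scores 1.0, random risk scores score about 0.5, so the reported 0.776 sits well above chance.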


Emergence of hybrid computational dynamics through reinforcement learning

Kononov, Roman A., Pospelov, Nikita A., Anokhin, Konstantin V., Nekorkin, Vladimir V., Maslennikov, Oleg V.

arXiv.org Artificial Intelligence

Understanding how learning algorithms shape the computational strategies that emerge in neural networks remains a fundamental challenge in machine intelligence. While network architectures receive extensive attention, the role of the learning paradigm itself in determining emergent dynamics remains largely unexplored. Here we demonstrate that reinforcement learning (RL) and supervised learning (SL) drive recurrent neural networks (RNNs) toward fundamentally different computational solutions when trained on identical decision-making tasks. Through systematic dynamical systems analysis, we reveal that RL spontaneously discovers hybrid attractor architectures, combining stable fixed-point attractors for decision maintenance with quasi-periodic attractors for flexible evidence integration. This contrasts sharply with SL, which converges almost exclusively to simpler fixed-point-only solutions. We further show that RL sculpts functionally balanced neural populations through a powerful form of implicit regularization -- a structural signature that enhances robustness and is conspicuously absent in the more heterogeneous solutions found by SL-trained networks. The prevalence of these complex dynamics in RL is controllably modulated by weight initialization and correlates strongly with performance gains, particularly as task complexity increases. Our results establish the learning algorithm as a primary determinant of emergent computation, revealing how reward-based optimization autonomously discovers sophisticated dynamical mechanisms that are less accessible to direct gradient-based optimization. These findings provide both mechanistic insights into neural computation and actionable principles for designing adaptive AI systems.


AI-Facilitated Analysis of Abstracts and Conclusions: Flagging Unsubstantiated Claims and Ambiguous Pronouns

Markhasin, Evgeny

arXiv.org Artificial Intelligence

We present and evaluate a suite of proof-of-concept (PoC), structured workflow prompts designed to elicit human-like hierarchical reasoning while guiding Large Language Models (LLMs) in the high-level semantic and linguistic analysis of scholarly manuscripts. The prompts target two non-trivial analytical tasks within academic summaries (abstracts and conclusions): identifying unsubstantiated claims (informational integrity) and flagging semantically confusing ambiguous pronoun references (linguistic clarity). We conducted a systematic, multi-run evaluation on two frontier models (Gemini 2.5 Pro and ChatGPT Plus o3) under varied context conditions. Our results for the informational integrity task reveal a significant divergence in model performance: while both models successfully identified an unsubstantiated head of a noun phrase (95% success), ChatGPT consistently failed (0% success) to identify an unsubstantiated adjectival modifier that Gemini correctly flagged (95% success), raising a question regarding the potential influence of the target's syntactic role. For the linguistic analysis task, both models performed well (80-90% success) with full manuscript context. Surprisingly, in a summary-only setting, Gemini's performance was substantially degraded, while ChatGPT achieved a perfect (100%) success rate. Our findings suggest that while structured prompting is a viable methodology for complex textual analysis, prompt performance may be highly dependent on the interplay between the model, task type, and context, highlighting the need for rigorous, model-specific testing. Keywords: AI-assisted, AI-powered, AI-enhanced, automated, machine learning, academic summary.


Efficient Latent Semantic Clustering for Scaling Test-Time Computation of LLMs

Lee, Sungjae, Kim, Hoyoung, Hwang, Jeongyeon, Park, Eunhyeok, Ok, Jungseul

arXiv.org Artificial Intelligence

Scaling test-time computation--generating and analyzing multiple or sequential outputs for a single input--has become a promising strategy for improving the reliability and quality of large language models (LLMs), as evidenced by advances in uncertainty quantification and multi-step reasoning. A key shared component is semantic clustering, which groups outputs that differ in form but convey the same meaning. Semantic clustering enables estimation of the distribution over the semantics of outputs and helps avoid redundant exploration of reasoning paths. However, existing approaches typically rely on external models, which introduce substantial computational overhead and often fail to capture context-aware semantics. We propose Latent Semantic Clustering (LSC), a lightweight and context-sensitive method that leverages the generator LLM's internal hidden states for clustering, eliminating the need for external models. Our extensive experiments across various LLMs and datasets show that LSC significantly improves the computational efficiency of test-time scaling while maintaining or exceeding the performance of existing methods.
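The core idea, clustering in the generator's latent space instead of calling an external entailment model, can be sketched as below. This is a hedged illustration under invented assumptions: LSC's actual procedure may differ, the embeddings here are placeholders for hidden states, and the leader algorithm with a cosine threshold is just the simplest possible grouping rule.

```python
# Hypothetical sketch: group LLM outputs whose hidden-state embeddings
# are nearly parallel, using a simple leader algorithm with a cosine
# similarity threshold. Each output joins the first existing cluster
# whose leader it is similar enough to, else it founds a new cluster.
import numpy as np

def latent_clusters(embeddings, sim_thresh=0.9):
    """Return a cluster label for each embedding (rows of `embeddings`)."""
    emb = np.asarray(embeddings, dtype=float)
    emb = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    leaders, labels = [], []
    for v in emb:
        for k, leader in enumerate(leaders):
            if float(v @ leader) >= sim_thresh:
                labels.append(k)
                break
        else:
            leaders.append(v)
            labels.append(len(leaders) - 1)
    return labels
```

With cluster labels in hand, the distribution over output semantics is just the normalized label histogram, which is what uncertainty-quantification methods built on semantic clustering consume.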


LLM Context Conditioning and PWP Prompting for Multimodal Validation of Chemical Formulas

Markhasin, Evgeny

arXiv.org Artificial Intelligence

Identifying subtle technical errors within complex scientific and technical documents, especially those requiring multimodal interpretation (e.g., formulas in images), presents a significant hurdle for Large Language Models (LLMs) whose inherent error-correction tendencies can mask inaccuracies. This exploratory proof-of-concept (PoC) study investigates structured LLM context conditioning, informed by Persistent Workflow Prompting (PWP) principles, as a methodological strategy to modulate this LLM behavior at inference time. The approach is designed to enhance the reliability of readily available, general-purpose LLMs (specifically Gemini 2.5 Pro and ChatGPT Plus o3) for precise validation tasks, crucially relying only on their standard chat interfaces without API access or model modifications. To explore this methodology, we focused on validating chemical formulas within a single, complex test paper with known textual and image-based errors. Several prompting strategies were evaluated: while basic prompts proved unreliable, an approach adapting PWP structures to rigorously condition the LLM's analytical mindset appeared to improve textual error identification with both models. Notably, this method also guided Gemini 2.5 Pro to repeatedly identify a subtle image-based formula error previously overlooked during manual review, a task where ChatGPT Plus o3 failed in our tests. These preliminary findings highlight specific LLM operational modes that impede detail-oriented validation and suggest that PWP-informed context conditioning offers a promising and highly accessible technique for developing more robust LLM-driven analytical workflows, particularly for tasks requiring meticulous error detection in scientific and technical documents. Extensive validation beyond this limited PoC is necessary to ascertain broader applicability. Keywords: AI-assisted, AI-powered, AI-enhanced, automated, knowledge engineering, machine learning.


TUMLS: Trustful Fully Unsupervised Multi-Level Segmentation for Whole Slide Images of Histology

Rehamnia, Walid, Getmanskaya, Alexandra, Vasilyev, Evgeniy, Turlapov, Vadim

arXiv.org Artificial Intelligence

Digital pathology, augmented by artificial intelligence (AI), holds significant promise for improving the workflow of pathologists. However, challenges such as the labor-intensive annotation of whole slide images (WSIs), high computational demands, and trust concerns arising from the absence of uncertainty estimation in predictions hinder the practical application of current AI methodologies in histopathology. To address these issues, we present a novel trustful fully unsupervised multi-level segmentation methodology (TUMLS) for WSIs. TUMLS adopts an autoencoder (AE) as a feature extractor to identify the different tissue types within low-resolution training data. It selects representative patches from each identified group based on an uncertainty measure and then performs unsupervised nuclei segmentation in their respective higher-resolution space without using any ML algorithms. Crucially, this solution integrates seamlessly into clinicians' workflows, transforming the examination of a whole WSI into a review of concise, interpretable cross-level insights. This integration significantly enhances and accelerates the workflow while ensuring transparency. We evaluated our approach using the UPENN-GBM dataset, where the AE achieved a mean squared error (MSE) of 0.0016. Additionally, nuclei segmentation was assessed on the MoNuSeg dataset, outperforming all unsupervised approaches with an F1 score of 77.46% and a Jaccard score of 63.35%. These results demonstrate the efficacy of TUMLS in advancing the field of digital pathology.
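The two segmentation metrics reported above, F1 and Jaccard, can be computed on binary masks as sketched below. This is a pixel-wise illustration only; the paper's MoNuSeg evaluation may aggregate per-nucleus rather than per-pixel.

```python
# Minimal sketch of pixel-wise F1 and Jaccard scores for binary
# segmentation masks. F1 = 2*TP / (2*TP + FP + FN) and
# Jaccard = TP / (TP + FP + FN); both degenerate cases (all-empty
# masks) are defined as a perfect score here.
import numpy as np

def f1_and_jaccard(pred, gt):
    """pred, gt: boolean (or 0/1) arrays of the same shape."""
    pred, gt = np.asarray(pred, bool), np.asarray(gt, bool)
    tp = np.logical_and(pred, gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    denom = tp + fp + fn
    f1 = 2 * tp / (2 * tp + fp + fn) if denom else 1.0
    jaccard = tp / denom if denom else 1.0
    return float(f1), float(jaccard)
```

The two metrics are monotonically related (F1 = 2J / (1 + J)), which is why the reported 77.46% F1 and 63.35% Jaccard move together.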